CDS 6324 - Data Visualization

Lecture 1: Introduction

1. Data Explosion

🧠 Memory Trick:

Data is growing exponentially, but human attention is NOT. The problem today is not lack of information, but information overload.

2. Herbert Simon's Idea

"A wealth of information creates a poverty of attention."
More information = Less attention available for each piece of information.
Exam Keyword:
Attention Scarcity

3. What is Visualization?

Data Visualization is the process of converting data into visual forms so people can understand patterns, trends and outliers more easily.
Researcher                Main Idea
McCormick (1987)          See the unseen
Stuart Card (2007)        Amplify cognition
Modern Definition         Find patterns, trends and outliers

4. Goals of Visualization

  1. Show the data
  2. Encourage comparison
  3. Avoid distortion
  4. Present many numbers in small space
  5. Make large datasets understandable
  6. Reveal multiple levels of detail
  7. Serve a clear purpose
🧠 Remember: GOOD visualization = Clarity
BAD visualization = Decoration

5. Why Create Visualizations?

Purpose                   Description
Record Information        Store observations and measurements
Support Reasoning         Analyze patterns and make decisions
Communicate Information   Share findings and persuade others
🧠 Exam Shortcut:

RRC
Record → Reason → Communicate

6. Case Study: Challenger Disaster

Engineers had data showing that O-ring damage became more likely at low launch temperatures.
The issue was not missing data.
The issue was poor visualization: the charts did not make the link between temperature and O-ring damage visible.
Lesson: Bad visualization can lead to bad decisions.

7. Case Study: John Snow Cholera Map

Cholera deaths in the 1854 London outbreak were plotted on a street map.
Cases clustered around the Broad Street pump, pointing to its water as the source.
🧠 Remember:
Map → Pattern → Hypothesis → Discovery

8. Case Study: Florence Nightingale

Created the Coxcomb (polar area) diagram to show the causes of soldier deaths in the Crimean War.
It showed that most deaths came from preventable disease rather than combat.
Visualization successfully persuaded leaders to improve sanitation.

9. Anscombe's Quartet

Four different datasets can have nearly identical summary statistics (mean, variance, correlation, regression line) but look very different when plotted.
🧠 Golden Rule:

NEVER trust summary statistics alone. Always visualize your data.
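
The golden rule can be checked numerically. The sketch below is illustrative and not part of the lecture: it uses the published Anscombe (1973) values, and the helper names `pearson` and `summarize` are made up for this example. It computes the same rounded summary for all four datasets, even though the y-values themselves differ.

```python
from math import sqrt
from statistics import mean, variance

# Anscombe's quartet (Anscombe, 1973): four datasets of 11 (x, y) points each.
# Datasets I, II and III share the same x-values.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson(xs, ys):
    """Pearson correlation coefficient, computed directly from its definition."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def summarize(xs, ys):
    """Rounded summary statistics (hypothetical helper for this sketch)."""
    return {
        "mean_x": round(mean(xs), 2),
        "mean_y": round(mean(ys), 2),
        "var_x": round(variance(xs), 1),   # sample variance
        "var_y": round(variance(ys), 1),
        "corr": round(pearson(xs, ys), 2),
    }


summaries = {name: summarize(xs, ys) for name, (xs, ys) in quartet.items()}
for name, stats in summaries.items():
    print(name, stats)
```

At this rounding, every dataset prints the same summary. A scatter plot of each dataset tells a different story: I is roughly linear, II is curved, III is linear with one outlier, and IV is a vertical stack of points with a single extreme value.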

10. Final Exam Summary

Most Important Points

  • Definition: Visual representation of data.
  • Goals: Show data, compare values, avoid distortion.
  • Three Purposes: Record, Reason, Communicate.
  • Anscombe: Same statistics ≠ Same data.
  • Challenger: Poor visualization can cause poor decisions.
  • John Snow: Visualization reveals patterns.
  • Nightingale: Visualization persuades people.